Sensitivity of Predictors in Educational Data: A Bayesian Network Model
Abstract
This research investigates the application of Bayesian networks (BNs) to predict causal relationships in a dataset that captures several demographic and academic features of a group of students at a four-year public university. The dataset contains both quantitative and qualitative variables, some of which exhibit strong pairwise dependence. To identify this dependence, a factorial analysis of mixed data is conducted; it accommodates both variable types and yields a new coordinate space that captures the variance of the data in fewer dimensions. This exploratory stage enables visualization of groups of dependent variables that may be used to predict outcomes of interest, and it also serves to validate the results of the BN structure modeling. The BN is learned using bootstrapped arc strength averaging, which yields a graphical relationship between variables in which each arc carries a persistence parameter reflecting how often it occurs in the learning process. The resulting network is characterized by two major, relatively independent structures: one formed by college academic performance metrics and the other by financial, housing, and student demographic variables. The prediction accuracy of the BN is evaluated using evidence from pre-college and ongoing college variables. The college degree completion outcome is found not to be sensitive to the pre-college evidence, which yields an accuracy rate of only 55%. The ongoing college evidence, however, improves the prediction accuracy by 75%.
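The idea of a per-arc persistence value can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the structure learner, so a thresholded mutual-information test is swapped in here as a stand-in, and it produces undirected dependence edges rather than a directed BN. The function names, the threshold of 0.05 nats, and the toy data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter
from itertools import combinations


def mutual_information(rows, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(rows)
    joint = Counter((r[i], r[j]) for r in rows)
    left = Counter(r[i] for r in rows)
    right = Counter(r[j] for r in rows)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a, b) * log(p(a, b) / (p(a) * p(b))), written with raw counts
        mi += (count / n) * math.log(count * n / (left[a] * right[b]))
    return mi


def learn_edges(rows, threshold):
    """Stand-in structure learner (assumption): keep an undirected edge
    whenever the pairwise mutual information exceeds the threshold."""
    n_vars = len(rows[0])
    return {
        (i, j)
        for i, j in combinations(range(n_vars), 2)
        if mutual_information(rows, i, j) > threshold
    }


def bootstrap_edge_strength(rows, n_boot=200, threshold=0.05, seed=0):
    """Fraction of bootstrap replicates in which each edge is learned."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        # Resample rows with replacement, keeping the original sample size
        sample = [rng.choice(rows) for _ in rows]
        counts.update(learn_edges(sample, threshold))
    n_vars = len(rows[0])
    return {e: counts[e] / n_boot for e in combinations(range(n_vars), 2)}


def make_toy_data(n=300, seed=1):
    """Hypothetical data: column 1 copies column 0 about 90% of the time,
    column 2 is generated independently of both."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x0 = rng.randint(0, 1)
        x1 = x0 if rng.random() < 0.9 else 1 - x0
        x2 = rng.randint(0, 1)
        rows.append((x0, x1, x2))
    return rows


if __name__ == "__main__":
    strengths = bootstrap_edge_strength(make_toy_data())
    for edge, strength in sorted(strengths.items()):
        print(edge, round(strength, 2))
```

In this sketch, an edge that appears in nearly every replicate has a strength close to 1, while an edge between unrelated columns has a strength close to 0. Averaging over resamples in this way is what makes the reported strength less dependent on any single fit to the data.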